Description

This invention relates to DASD storage subsystems, and more particularly, to methods and means for managing
spare DASD array capacity so as to optimize array operations in fault tolerant, degraded, and data rebuild modes. 


Arrays, Effect of Redundancy; Reading and Writing 

In the prior art, it is known to read and write data + parity (as defined over the data) from and to a synchronous 
array of N data + P parity DASDs. The DASD array arrangement increases the data rate by N * rate of a single DASD
and increases logical track size by N * single DASD track length. Reference can be made to Patterson et al, "A Case
For Redundant Arrays Of Inexpensive Disks (RAID)", Report No. UCB/CSD 87/391, December 1987, Computer Science
Division, U. of California, Berkeley.

It is also known that writing a data string to a DASD array includes segmenting data into N blocks (termed striping), 
determining parity over the blocks, and storing the N data + P parity blocks at a synchronous address on counterpart 
failure independent DASDs. Likewise, reading a data string from an array involves copying N+P addressed blocks 
from a synchronous address on counterpart DASDs into a buffer, concatenating them, checking parity, and serially 
transporting the concatenated blocks (string) from the buffer to the accessing CPU. 
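The write and read paths just described can be sketched as follows. This is a minimal illustration only, assuming P=1 with bytewise XOR parity and zero padding of the last block; the function names are hypothetical and do not come from the cited prior art.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def write_stripe(data: bytes, n: int):
    """Segment data into n equal blocks (zero padded) and append one parity block."""
    size = -(-len(data) // n)                      # ceiling division
    padded = data.ljust(size * n, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(n)]
    return blocks + [xor_blocks(blocks)]           # N data blocks + 1 parity block

def read_stripe(stripe, length: int) -> bytes:
    """Check parity, then concatenate the data blocks back into the string."""
    *blocks, parity = stripe
    if xor_blocks(blocks) != parity:
        raise ValueError("parity mismatch")
    return b"".join(blocks)[:length]

message = b"hello, DASD array"
stripe = write_stripe(message, n=4)                # one block per failure independent DASD
restored = read_stripe(stripe, len(message))
```

Each element of `stripe` stands for the block that would be stored at the same synchronous address on a different DASD.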

DASD Failure and MTBF 


It is well recognized that aggregating DASDs into arrays decreases the mean time between DASD failure. However 
the combined use of redundant information (parity), dynamic substitution of formatted spare DASDs for failed ones, 
and reconstruction of missing data onto the substituted spare, substantially increases the mean time between data 
unavailability by orders of magnitude. This is described in Park et al, "Providing Fault Tolerance In Parallel Secondary
Storage Systems", Princeton University Report CS-TR-057-86, November 1986 and Dunphy et al, US Pat. 4,914,656,
"Disk Drive Memory", issued April 3, 1990.

Two Usages of Parity Groups 

The term "parity group" has acquired data oriented and storage oriented usages. In the data usage, "parity group"
signifies a predetermined number of logically associated data blocks + a parity or equivalent redundant information
defined over or derived from the data blocks. In the storage usage, "parity group" signifies a predetermined number
of logically associated physical storage locations whose data contents determine the value of the redundant information.

Parity Spreading and DASD Arrays

European application EP 469 924 ("Method and Means for Managing DASD Array Accesses When Operating In 
Degraded Mode", inventors - Mattson and Ng) uses parity group as a logical association of data blocks as applied to
Patterson's RAID 5 type DASD array. In Mattson, the data and storage boundaries of the parity group were not
necessarily coextensive. The only limitation was that no two segmented blocks from the same data parity group be written
onto the same DASD.

In contrast, Dunphy et al, US Pat. 4,914,656, "Disk Drive Memory", issued April 3, 1990, defines parity over data 
as in Mattson. However, in Dunphy the storage boundaries are the same as that of the data parity group. They are 
maintained even in the presence of failure since a spare DASD substitutes for the failed DASD and missing data is
rebuilt and rewritten onto the substituted spare.

Clark et al, US Pat. 4,761,785, "Parity Spreading Enhanced Storage Access", issued August 2, 1988 discloses a 
non-synchronous DASD storage subsystem in which parity groups are defined over a range of storage and where
data boundaries are NOT necessarily coextensive with storage boundaries. In Clark et al, each data string is written
into consecutive locations of a "storage" parity group. If the data string size exceeds the capacity of the group, then
the residue of the data string is written into another "storage" parity group. If the data string is smaller than the group,
then the remaining space may well be occupied by other data strings. Parity is taken across information occupying
logically associated extents (range of address locations) of the DASDs forming the group.

Distributed Parity and Virtual or Distributed Sparing 


Clark et al also taught that the location of the parity blocks for counterpart storage parity groups could be distributed 
across DASDs in the subsystem with the only limitation that not all of the blocks be written on a single DASD. This was 
contrary to Ouchi, US Pat. 4,092,732, "System for Recovering Data Stored In A Failed Memory Unit", issued May 30,
1978, and Dunphy et al where parity is written to dedicated DASDs.

EP 0 518 603 B1

European patent application EP 462 917 ("Method and Apparatus for Recovering Parity Protected Data", inventors 
- Bond et al) teaches the use of a virtual spare among non-synchronous DASDs where parity groups are defined across 
storage, and data and storage boundaries are not necessarily coextensive. 

In Bond et al, the CPU can read and write to a logical spare DASD. The logical addresses are then distributed in
non-specific manner across one or more real DASDs. More particularly, Bond teaches that the locations of parity blocks
distributed as in Clark may be overwritten as if they were spare locations in the reconstruction of data being accessed
after a DASD has failed. Usually, this means the parity block of the parity group covering the lost data. Bond extends
this notion to the use of other spare or nonessential locations among the DASDs.


Array Performance Limitations Using Bond et al Type Distributed Sparing 

In a Bond et al type of distributed sparing via writing reconstructed data into the group parity location, once recovery
is complete, the system operates without parity. This is inimical to a highly available, fault-tolerant system. Alternatively,
Bond et al reserves spare blocks on different DASDs. When a DASD fails, data is recreated and written onto spare
blocks on surviving DASDs. There are a number of problems inherent with this form of distributed sparing:

a. two or more data blocks from the same storage parity group are written on the same DASD. If a DASD were to
fail now, the DASD array subsystem would lose data, which is unacceptable.

b. data blocks of the same group formerly located on different DASDs now being resident upon the same DASD
cannot be read in parallel.

c. lowered throughput because of resource sharing among competitive processes occasioned by DASD arm stealing
between the reading of the surviving data blocks to compute any missing data or parity block and the writing
of a reconstructed data or parity block.

These problems are solved by the method and apparatus as set forth in the independent claims. 

The present invention seeks to overcome these problems and accordingly provides in one aspect, a method for
rebuilding portions of parity groups resident on a failed DASD in a storage subsystem having a plurality of DASDs,
each parity group including N data, P parity, and S spare blocks, each DASD storing K blocks, the method comprising
the steps of: configuring an array of N+P+S DASDs; distributing K parity groups (where K/(N+P+S) is an integer) in
synchronous array addresses across subsets of N+P DASDs of the array such that no two blocks from the same parity
group reside on the same DASD, each DASD storing data or parity blocks from (K-K*S/(N+P+S)) parity groups; the
method being characterised by the steps of: distributing K*S blocks as spare storage across the array such that each
DASD includes K*S/(N+P+S) spare blocks thereon; and in the event of a single DASD failure, for each of the
K-K*S/(N+P+S) parity groups on the failed DASD, regenerating the lost data or parity block of the parity group of said failed
DASD from the remaining data and parity blocks of said parity group, and writing the regenerated block into the spare
block of said parity group such that no two blocks of the same parity group are distributed on the same DASD.

The present invention thus facilitates the reconstruction of missing data and parity blocks and copies them back
into spare block DASD locations, such that no two blocks of the same parity group are distributed on the same DASD.

In a second aspect of the invention, there is provided a storage subsystem for accessing parity groups each
comprising N data blocks, P parity blocks and S spare blocks, the subsystem comprising: an array formed from N+P+S
DASDs, each DASD storing K blocks; first means for distributing K parity groups (where K/(N+P+S) is an integer)
across counterpart subsets of N+P DASDs selected from the array such that no two blocks from the same parity group
are stored on the same DASD; means for distributing K*S blocks of storage as spare blocks such that each array DASD
reserves K*S/(N+P+S) blocks thereon; identifying means for identifying any single DASD failure; and means responsive
to any single DASD failure identified by the identifying means for processing each of the K-K*S/(N+P+S) parity groups
of the failed DASD by regenerating the lost data block or parity block of the parity group of said failed DASD from the
remaining data and parity blocks of said parity group, and writing the regenerated block into the spare block of said
parity group such that no two blocks of the same parity group are distributed on the same DASD.

Furthermore, in the subsystem and method of the present invention, the number of accesses to reconstruct missing 
data and parity blocks and their copyback into spare block locations is reduced in comparison with prior art methods. 
In addition, the throughput is maximized during reconstruction and copyback of missing data or parity blocks and
subsequent reference thereto.

The present invention is readily applicable in storage subsystems addressing two or more failure independent 
DASD arrays. Also, said method and means should be extensible such that the combinatorial design should be
distributable over multiple failure independent arrays and with regard to different sparing ratios or fractions.

In a preferred method P=S=1 and an array of N+2 DASDs is configured. The next step involves distributing K-K/(N+2)
parity groups in synchronous array addresses across subsets of N+1 DASDs of the array such that no two blocks
from the same parity group reside on the same DASD. Concurrently, K blocks as spare storage are distributed across
the array such that each DASD includes K/(N+2) spare blocks thereon. As in the general case, no more than one spare
storage block nor more than one parity block are stored on the same synchronous array address or on the same DASD.

In the event of a single DASD failure, for each of the K-K/(N+2) parity groups, N blocks belonging to the group are logically
combined from N other DASDs into a single block. Each single block is written into a counterpart one of the remaining
K*(N+1)/(N+2) spare blocks such that no two blocks of the same parity group are distributed on the same DASD.
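The block counts quoted above follow directly from N, P, S and K. The sketch below is only an arithmetic check under the stated condition that K/(N+P+S) is an integer; the function name and dictionary keys are hypothetical.

```python
def spare_accounting(n: int, p: int, s: int, k: int) -> dict:
    """Per-DASD block counts for an array of N+P+S DASDs holding K blocks each."""
    width = n + p + s
    if k % width:
        raise ValueError("K must be a multiple of N+P+S")
    spare_per_dasd = k * s // width                 # K*S/(N+P+S)
    info_per_dasd = k - spare_per_dasd              # K - K*S/(N+P+S)
    return {
        "dasds": width,
        "spare_per_dasd": spare_per_dasd,
        "groups_hit_by_one_failure": info_per_dasd,
        "spare_left_on_survivors": spare_per_dasd * (width - 1),
    }

# Preferred method, P=S=1, with the N=4, K=6 values used in the figures.
counts = spare_accounting(n=4, p=1, s=1, k=6)
```

With these values each DASD holds one spare block, a single failure touches five parity groups, and five spare blocks, i.e. K*(N+1)/(N+2), remain on the surviving DASDs to receive the rebuilt blocks.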

In an alternative embodiment (where P=1, S=2), 2*K blocks of spare storage and (K-2*K/(N+3)) parity blocks are
distributed such that no more than two spare storage blocks nor more than one parity block are stored on the same
synchronous array address nor on the same DASD. This permits rebuilding and writing missing data to a first series
of spare blocks after a first DASD has failed and repeating the process in the rarer event that a second DASD should fail.

The present invention is also applicable to storage subsystems having multiple failure independent DASD arrays. 
Where multiple DASD failures occur in the same array, the missing data is first rebuilt on the spare space of the first 
array and the remaining missing data rebuilt on the spare space of the second array.

Where storage subsystem expansion involves one array with distributed spare capacity and an array without a 
spare, such capacity may be conveniently shared. Also, the blocks representing the capacity of one or more spare 
DASDs can be distributed across multiple arrays so that no synchronous address or DASD has more than one such 
spare block increment. Next, the parity groups can be written across their respective arrays in a rotated or block offset
manner. This would permit a uniform sharing.

Other distributions of parity groups and sparing are disclosed which permit storage subsystem DASD array
expansion while maintaining a sparing fraction or ratio objective.

Preferred embodiments of the invention will now be described, by way of example only, with reference to the 
accompanying drawings, in which: 

Figure 1 shows parity groups coextensive with storage bounds using dedicated parity and spare DASDs and
reconstruction of missing data or parity onto the spare DASD according to the prior art.

Figure 2 depicts distributing K parity groups {(4+P)<(N+2)} and K spare spaces over an array of N+2 DASDs of
capacity K blocks or spaces/DASD permitting recovery from a single DASD failure according to the invention, no two
elements of the same group nor space being located on the same DASD.

Figure 3 also sets out a distribution of K parity groups {(3+P)<(N+2)} and 2K spare spaces over an array of N+2
DASDs of capacity K blocks/DASD permitting recovery of up to two DASD failures according to the invention, no two
elements of the same group nor spare space being located on the same DASD.

Figure 4 illustrates two DASD arrays with distributed sparing.

Figures 5 and 6 show failure of one or more DASDs in a first one of two arrays and the rebuilding of missing data
on the distributed spare spaces across both arrays.

Figure 7 depicts partially distributed sparing.

Figure 8 depicts a synchronous array of N+2 DASDs attached to a CPU by way of an array control unit.

DASD Array Architecture

Referring now to figure 8, there is shown CPU 1 accessing DASDs 1 through N+2 over a path including channel
3, array controller 5, and cache 13. Controller 5 operatively secures synchronism and accesses among any N+1 at a
time of the N+2 DASDs, i.e. DASD 1 through DASD N+2, over control path 7.

N+1 streams of data defining a predetermined number of consecutive bytes can be exchanged in parallel to cache
13 over data path 15. The N+1 streams of data may all be responsive to a single access (synchronous operation).
Also, this invention is operative where each of the N+1 streams may be responsive to different accesses (asynchronous
operations). Likewise, data can be exchanged serially by byte between CPU 1 and controller 5 over path 3 after a
parallel to serial conversion in controller 5 in the read direction and a serial to parallel conversion in the write direction.
In the read direction, data is supplied from cache 13 to controller 5 via data paths 9 and 11. In the write direction,
data is moved from the controller 5 to the cache 13 over paths 9 and 11.

Parity, DASD Failure and Sparing and Robustness 

DASD arrays use parity to protect against single DASD failures. If a DASD fails, data that used to be on that DASD 
can be reconstructed, as needed, using the data and parity on the surviving DASDs. This is illustrated in Table 1 in a
DASD array of five DASDs. In this table, Pi is a parity block that protects the four data blocks labelled Di. Such a
DASD array is called a 4+P array, since there is one parity block for every four data blocks.

Table 1.

P1   D2   D3   D4

Only one track (consisting of five blocks) is shown from each of the DASDs. P1 contains the parity or exclusive
OR of the blocks labeled D1 on all the data DASDs. Similarly P2 is the exclusive OR of the blocks labeled D2 on all
the DASDs, and so on. It should be noted that the parity blocks are distributed amongst all the drives to balance the
workload.

Such a DASD array is robust against single DASD failures. If DASD 1 were to fail, data on it can be recreated by
reading data and parity from the remaining four DASDs and performing the appropriate exclusive OR operations.
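The recreation of a lost block by exclusive OR can be illustrated with a few bytes. This is a minimal sketch for one 4+P parity group; the block contents and the helper name are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four data blocks of one parity group (illustrative contents only).
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
parity = xor_blocks(data)

# Suppose the DASD holding data[0] fails: XOR the three surviving data blocks with parity.
survivors = data[1:] + [parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, `rebuilt` equals the lost block whichever single member of the group is missing, including the parity block itself.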

An array is said to enter "degraded mode" when a DASD in the array fails, because the performance and reliability
of the array become degraded. Performance is degraded since every DASD access to a block on the failed DASD
now requires DASD accesses to other DASDs in the array for reconstructing the block that is no longer accessible.
Reliability is degraded, since if a second DASD fails before the failed DASD is replaced and the data on it reconstructed,
the array will lose data.

In the specification of this invention, the term "reliability" indicates the degree of immunity from data loss that an
array possesses. The higher the immunity from data loss, or the higher the mean time to data loss (MTTDL), the higher the
reliability of the array.

To minimize the probability of losing data and the length of time the array operates with degraded performance,
arrays sometimes use "hot spare" DASDs that are an integral part of the array. The spare DASD(s) is (are) unused
during normal operations. On a DASD failure, the data that used to be on the failed DASD is rebuilt to a spare DASD. 
The array is said to leave degraded mode operation and re-enter normal mode operation (sometimes called "fault 
tolerant mode") as soon as the data on the failed DASD has been rebuilt to a spare DASD. 

Traditional Sparing

Referring now to figure 1, there is shown an array in which parity groups are coextensive with storage bounds.
The array uses dedicated parity and spare DASDs in aid of reconstruction of missing data or parity onto the spare
DASD according to the prior art.

The array in figure 1 comprises N+2 DASDs (for N=4). One of the DASDs is a spare DASD that is unused in normal
operation and the remaining 5 DASDs operate as a 4+P array. This is termed "dedicated sparing". Each of the remaining
N+1 (5) DASDs is divided into some number K of blocks or block locations. A "parity group" consists
of N data blocks and one parity block, i.e. one block from each of the N+1 DASDs. The array can then store K parity
groups, each with N+1 blocks or block locations.

In this invention and the prior art such as Dunphy et al, all logically related N+1 blocks of data and parity are one
to one mapped into N+1 blocks of addressable storage. Consequently, the distinction between data and storage oriented
parity groups disappears.

If data in any block location is lost, it can be reconstructed from the remaining N block locations of that parity group. 
When a DASD fails, K block locations from K different parity groups are lost. Each lost block location can be rebuilt
using the corresponding block locations from the surviving DASDs. The rebuilt data is written to the spare DASD.

Referring again to figure 1, a failed DASD is shown as being crossed out. The reconstruction of the data contents
is depicted as the XORing of the operative contents from the N other DASDs. In the figure 1 array, a DASD failure
requires the array to read 6 block locations from each of four DASDs, and to write 6 blocks to the spare DASD which
now replaces the failed DASD. That is, a total of 30 DASD I/Os (24 reads and 6 writes) are needed to complete the rebuild.

Traditional sparing suffers two disadvantages. First, it does not utilize the spare DASDs in normal operation.
Second, the non-use of the spare DASD raises a doubt as to its operability when needed as a substitute for a failed DASD.
The second drawback could be overcome if the array controller were to periodically read and write data to various
tracks of the spare DASD in order to verify its correct operation.

Distributed Sparing

Referring now to figure 2, there is shown distributed sparing and distributed parity in an array comprising N+2
DASDs (N=4) as before. Some number of block locations (labelled s1 through s6) are left unused in each DASD such
that the total spare space on all the DASDs is equal to the capacity of a single DASD. Thus, this method leaves the
same amount of spare space as current methods that use a dedicated spare DASD, but the spare space is distributed
among all the DASDs rather than concentrated in a single DASD. Each parity group is extended to contain N data
blocks, a parity block and a spare block. The data and parity blocks are also called "information blocks" to distinguish
them from spare blocks.

As is apparent from figure 2, no two blocks from a parity group are on the same DASD. Therefore, if a DASD fails,
at most one information block from any parity group is lost and this lost block can be rebuilt from the remaining
information blocks of the parity group on the other DASDs. The lost block from a parity group is rebuilt to the spare block
for that parity group which is on another DASD.

For example, if DASD 3 were to fail in figure 2, block d1 would be rebuilt to DASD 6, block d2 to DASD 5, block
p3 to DASD 4, block d5 to DASD 2 and block d6 to DASD 1. Note that all the information blocks d4 and p4 survive the
failure of DASD 3 and do not need to be rebuilt. In all, four blocks had to be read from each of 5 DASDs and one block
had to be written to each of five DASDs, for a total of 25 DASD I/Os to complete the recovery process. This is an
improvement over the 30 DASD I/Os that were needed to complete a rebuild in the traditional sparing approach.

It follows that no two information blocks from a parity group end up on the same DASD following the rebuild, making
it possible to tolerate another DASD failure at this point.
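The rebuild targets named in the DASD 3 example can be reproduced with a small layout model. This is a sketch only: the rotation used here (spare on DASD N+2 in the first row and moving one DASD to the left per row, parity immediately to the left of the spare) is an assumption chosen to agree with the targets stated in the text, since the figure itself is not reproduced here, and the function names are hypothetical.

```python
def build_layout(n: int, k: int):
    """Return k rows; each row lists 'd', 'p' or 's' for DASD 1..N+2 (0-based list)."""
    width = n + 2
    rows = []
    for r in range(k):
        spare = (width - 1 - r) % width        # assumed: spare rotates leftwards from the last DASD
        parity = (spare - 1) % width           # assumed: parity sits just left of the spare
        row = ["d"] * width
        row[spare] = "s"
        row[parity] = "p"
        rows.append(row)
    return rows

def rebuild_targets(rows, failed: int):
    """For each parity group that lost an information block, return (group, target DASD)."""
    targets = []
    for r, row in enumerate(rows):
        if row[failed - 1] != "s":             # a lost spare block needs no rebuild
            targets.append((r + 1, row.index("s") + 1))
    return targets

rows = build_layout(n=4, k=6)
targets = rebuild_targets(rows, failed=3)
```

Under these assumptions `targets` lists groups 1, 2, 3, 5 and 6 going to DASDs 6, 5, 4, 2 and 1 respectively, and group 4 is skipped because DASD 3 held only its spare block. Since every row places exactly one block per DASD and the rebuilt block lands in the row's own spare position, no DASD ends up with two blocks of the same group.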

Distributed Sparing Performance Consequences 

N+2 DASDs are used in normal mode (when no DASD has failed) as opposed to N+1 DASDs in current methods.
Typical values for N are between 4 and 10. With N=4, the distributed sparing scheme uses 6 DASDs in parallel instead
of 5 and potentially improves performance by 20% in normal mode. With N=10, distributed sparing could improve
performance by 9% in normal mode.
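The two percentages follow from the ratio of DASDs working in parallel. The check below assumes, as the text's "potentially" suggests, that throughput scales linearly with the number of active DASDs; the function name is hypothetical.

```python
def normal_mode_gain(n: int) -> float:
    """Fractional gain from using N+2 DASDs in parallel instead of N+1."""
    return (n + 2) / (n + 1) - 1

gain_n4 = normal_mode_gain(4)     # 6 DASDs instead of 5
gain_n10 = normal_mode_gain(10)   # 12 DASDs instead of 11
```

The gain is 1/(N+1), so it shrinks as the array widens.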

When a DASD fails, the array is said to operate in degraded mode. Distributed sparing has better performance in
degraded mode than traditional sparing for two reasons. First, more parallelism (N+1 DASDs used instead of N in
current methods) is involved. Secondly, in distributed sparing, only K-(K/(N+2)) blocks are lost (as opposed to K blocks
for current methods) when a DASD fails. In the earlier example, 5 blocks were lost when a DASD failed, whereas the
traditional sparing approach lost 6 blocks when a DASD failed. Since accesses to lost blocks require extra accesses,
the fewer blocks lost the better the overall performance.

Finally, distributed sparing has better performance during rebuild of lost data. In traditional sparing, the lost data
is recovered to a single DASD which can be a bottleneck. With distributed sparing, the data is recovered in parallel to
multiple DASDs so that no single DASD is a bottleneck. Furthermore, since less data is lost in this method, less data
needs to be recovered. This explains why, in the example, distributed sparing only needed 25 I/Os instead of the 30
I/Os required in traditional sparing.

Distributed sparing requires (N+1)*(K - (K/(N+2))) I/Os versus (N+1)*K I/Os for traditional sparing. The number of
I/Os needed for rebuild is thus reduced to the fraction (N+1)/(N+2) of that needed for traditional sparing.
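The two I/O formulas can be checked against the N=4, K=6 example. This is a small arithmetic sketch; the function names are hypothetical and K is assumed to be a multiple of N+2.

```python
def traditional_rebuild_ios(n: int, k: int) -> int:
    """Read K blocks from each of N survivors and write K blocks to the spare DASD."""
    return n * k + k                      # (N+1)*K

def distributed_rebuild_ios(n: int, k: int) -> int:
    """Only K - K/(N+2) information blocks are lost; each costs N reads and 1 write."""
    lost = k - k // (n + 2)
    return (n + 1) * lost

n, k = 4, 6
traditional = traditional_rebuild_ios(n, k)
distributed = distributed_rebuild_ios(n, k)
```

For these values the counts are 30 and 25, and their ratio is 5/6, i.e. (N+1)/(N+2).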

As with traditional sparing, the rebuild of missing data and parity blocks preferably begins at cylinder zero of all
DASDs and sweeps to the last cylinder. At the start of the sweep, the block lost from the first parity group would be
rebuilt to DASD N+2; so DASD N+2 would be writing and the other DASDs would be reading. Then, for the second
parity group, DASD N+1 would be writing and the other DASDs would be reading, and so on to the last parity group.
Thus, in figure 2, DASDs 1, 2, 4 and 5 would read block 1, while DASD 6 would write block 1. Then, DASDs 1, 2, 4
and 6 would read block 2 while DASD 5 would write block 2; and so on.
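The sweep can be written out as a schedule of readers and one writer per parity group. The sketch assumes the spare for group i lies on DASD N+3-i (DASD N+2 for the first group, DASD N+1 for the second, and so on, wrapping around), which matches the sequence described above; the function name is hypothetical.

```python
def rebuild_sweep(n: int, k: int, failed: int):
    """Return (group, reading DASDs, writing DASD) for each group that needs a rebuild."""
    width = n + 2
    steps = []
    for row in range(1, k + 1):
        writer = (width - row) % width + 1      # assumed spare rotation: N+2, N+1, ..., 1
        if writer == failed:
            continue                            # the failed DASD held this group's spare
        readers = [d for d in range(1, width + 1) if d not in (failed, writer)]
        steps.append((row, readers, writer))
    return steps

steps = rebuild_sweep(n=4, k=6, failed=3)
total_ios = sum(len(readers) + 1 for _, readers, _ in steps)
```

Each step keeps all surviving DASDs busy at the same cylinder, N of them reading and one writing, and the writer changes from group to group so that no single DASD absorbs all the writes.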

Spare Distribution, Rebuild and Arm Stealing

If distributed spare space were placed on the last few cylinders of each DASD, arm stealing would occur between
the reads and writes (operations) to a DASD during rebuild. This stems from the fact that the read operations require
the arms to sweep from the first cylinder on down, whereas the write operations require the arms to be at the last
cylinders.

Preferably, spare space should be distributed at several different physical locations on different DASDs in order
to minimize arm stealing during rebuild. However, physical address placement is involved in a trade off with other
access performance factors. For instance, if all the spare space is at the extremities, then there would be less arm
motion in normal operation of each drive.

Referring now to figure 3, there is shown an array in which the spare capacity of up to two DASDs is distributed
in a uniform pattern across N+3 DASDs. That is, in an array of N+3 DASDs each having a capacity of K blocks/DASD,
2*K blocks are distributed thereacross such that no more than two spare blocks are located on the same stripe
or on the same DASD. Also, figure 3 depicts distribution of K*P parity blocks such that no more than a single parity
block is located on the same stripe and on the same DASD.

Referring again to figure 3, it is apparent that an N+2 DASD array of fixed size formed where N=4 requires that a
4+P sized parity group tolerant of single DASD failures would have to be reformatted and striped on a 3+P block basis.
This would provide the 2*K spare capacity which could then be uniformly distributed.

It is considered well within the scope of this invention to extend the precept to higher numbers of failure tolerances. 


Distributed Sparing on Systems With Multiple Arrays 

Referring now to figures 4-8, there are shown distributed sparing among two or more failure independent arrays
of DASDs. Multiple array configurations are significant where storage subsystem capacity is expanded incrementally.
That is, it is well within the scope of this invention to distribute and use spare capacity among failure independent
addressable DASD arrays so as to minimize both performance and cost impacts.

Storage Subsystem Expansion Where Each Array Includes One DASD Distributed Spare Capacity 

Referring now to figure 4, there are depicted two 4+P arrays each having one DASD spare capacity distributed
uniformly by or within the individual array. If a DASD in an array were to fail, that DASD would be rebuilt to the spare
space distributed in that same array. That is, the spare space distributed in each array is available in common. Reference
may be made to Dunphy et al, where dedicated spare DASDs were reserved in common among parity groups. However,
Dunphy required that a spare DASD be dynamically and automatically switched to replace the failed DASD. Such
switching is nowhere required in this invention.

Referring again to figure 4, if DASD 1 from array 1 were to fail, followed by DASD 3 from array 2, DASD 1 would
be rebuilt to spare space in array 1 and DASD 3 would be rebuilt to spare space on array 2. However, if DASD 1 from
array 1 fails first, this causes it to be rebuilt to spare space in array 1. Next, if DASD 4, also from array 1, fails, then no
spare space is available on array 1. In this invention, the contents of the failed DASD 4 that is the second failure should
be rebuilt onto the spare space on array 2.
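The choice of where to rebuild can be expressed as a simple fallback rule. The following is a hypothetical sketch, not the controller logic of the invention: it assumes the subsystem tracks, per array, how many DASD-equivalents of distributed spare space remain unused, and the names `choose_spare_array` and `free_spare` are illustrative.

```python
def choose_spare_array(failed_array: int, free_spare: dict):
    """Prefer spare space in the failed DASD's own array, else any array with spare left."""
    if free_spare.get(failed_array, 0) > 0:
        return failed_array
    for array_id, free in sorted(free_spare.items()):
        if free > 0:
            return array_id
    return None                                  # no spare space anywhere

# Two arrays, each with one DASD-equivalent of distributed spare space (as in figure 4).
free_spare = {1: 1, 2: 1}

first = choose_spare_array(1, free_spare)        # DASD 1 of array 1 fails
free_spare[first] -= 1
second = choose_spare_array(1, free_spare)       # DASD 4 of array 1 fails next
free_spare[second] -= 1
third = choose_spare_array(1, free_spare)        # a further failure finds no spare space
```

In this run the first failure is rebuilt within array 1 and the second spills over to array 2, as in the scenario described above.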

Referring now to figure 5, there is depicted the failure of DASD 1. DASD 1 originally stored one data block from
each of four parity groups (d1-d4), a parity block (p5) from a fifth parity group, and a spare or unassigned block (s6).
The spares (s1-s6) are assigned such that no synchronous address (the same address position across each of the
DASDs in the array) has more than one spare block. The same applies to each DASD. In this aspect of the invention,
DASD 1 is rebuilt such that blocks d1-d4 and p5 are determined by logically combining (XORing) the counterpart N other
blocks of the parity group into a single block and writing the single block into a spare s(i) having the same synchronous
address as the other blocks. Consequently, block d2 is formed by XORing the d2 blocks in the same address on DASDs
2, 3, and 6 and parity p2 on DASD 4. The rebuilt block d2 is then written in the spare position on DASD 5. This is
repeated for all but the last block s6. Since s6 is a spare block, no logical combining and writing actions are required.

Referring now to figure 6, there is shown a second DASD failure (DASD 4) occurring in the same array (array 1)
as the first failure. The sparing in both arrays is distributed such that each DASD has K/(N+2) spare blocks and
only one spare block can appear in the same synchronous address and DASD. This means that DASD 4 from array
1 can be rebuilt by logically combining the remaining N DASDs in the group into a single block and writing the block
into counterpart spare blocks in array 2.

A storage subsystem can be expanded by adding one array at a time where each array comes with its own
distributed spare space. A limitation of this expansion as configured is that it is not possible to have a single spare that
is shared by multiple arrays. Therefore, the cost of sparing may be higher than that which can be afforded by a system
or that which is appropriate for the system.

Referring again to figure 6, another limitation arises: after the second failure, the blocks of array 1 are now scattered
across 10 DASDs (4 surviving DASDs from array 1 and 6 from array 2) instead of the original 6 DASDs. Therefore,
the simultaneous failure of any 2 DASDs from this group of 10 would cause data loss. In other words, as DASDs fail
and are rebuilt to other arrays, the Mean-Time-To-Data-Loss (MTTDL) of the system gets somewhat worse. The MTTDL
is calculated as the probability that a second DASD fails shortly after the first one has failed and before it has been rebuilt.

Storage Subsystem Expansion Where Arrays Share DASD Distributed Spare Capacity

Referring now to figure 7, array 1 is a 4+P with distributed sparing according to the precept of the invention. 
However, array 2 is also 4+P but without sparing. 

If a DASD in array 1 were to fail, five blocks would be lost (one of the blocks is a spare block), and there would be
five spare blocks on the other five DASDs in the array to rebuild the five blocks lost. If a DASD in array 2 were to fail,
6 blocks would be lost, and there would be 6 spare blocks on the 6 DASDs in array 1 to rebuild the 6 lost blocks. In
this way, the distributed spare blocks in array 1 would be available to rebuild blocks of a single failure occurring in
either array.

Advantages and Limitations of Partially Distributed Sparing

Besides the advantage that such an approach allows us to share a single spare amongst multiple arrays and does 
not require each array to have a spare, it has the additional advantage that the spare space requirements can be 
adjusted with system growth by allowing the choice of either adding an array with a spare or an array without a spare. 

This aspect of the invention suffers the limitation that as DASDs fail and are rebuilt to other arrays, the MTTDL
of the system will drop, until the failed DASDs are replaced and the dispersed data copied back.

Spare DASD Distributed Across Multiple Arrays 

Referring now to Table 2, the spare blocks of a single spare DASD are shared across multiple arrays. In this regard,
the spare is shared across two 2+P arrays. In the following embodiment, the distribution step and means are shown
as they pertain to the laying out of parity groups and a single DASD's worth of spare space across two arrays.



Table 2.

Block   DASD 1  DASD 2  DASD 3  DASD 4  DASD 5  DASD 6  DASD 7
  1       D1      D1      P1      d1      d1      p1      S1
  2       D2      P2      D2      d2      p2      S2      d2
  3       P3      D3      D3      p3      S3      d3      d3
  4       D4      D4      P4      S4      d4      d4      p4
  5       D5      P5      S5      D5      d5      p5      d5
  6       P6      S6      D6      D6      p6      d6      d6
  7       S7      D7      D7      P7      d7      d7      p7

Data and parity blocks of array 1 are indicated by uppercase Ds and Ps; data and parity blocks of array 2 are
indicated by lowercase ds and ps. Note that all the data and parity from array 1 are on DASDs 1, 2, 3 and 4 and that
all the data and parity from array 2 are on DASDs 4, 5, 6 and 7. Therefore, the MTTDL of either array is the same as
that of any 2+P array with distributed sparing.

The distribution step operates as follows: 
Referring to Tables 2 and 3, S1 is placed on block 1 of DASD 7, S2 on block 2 of DASD 6, and so on. That is,
the spares are rotated across the 7 DASDs shown in a uniform way. Next, the blocks of array 1 are stored on DASDs
1, 2 and 3 as:



Table 3.

Block   DASD 1  DASD 2  DASD 3
  1       D1      D1      P1
  2       D2      P2      D2
  3       P3      D3      D3
  4       D4      D4      P4
  5       D5      P5      D5
  6       P6      D6      D6
  7       D7      D7      P7
 etc.



However, if one of these blocks must be a spare as determined by the spare rotation above, then give priority to 
the spare and shift any data and parity blocks to the right to accommodate the spare. Thus as expressed in Table 4, 
the first four rows of the placement of array 1 are unaffected by spares, but the other three rows are affected by spare 
placement. The result causes array 1's placement to become: 



Table 4.

Block   DASD 1  DASD 2  DASD 3  DASD 4
  1       D1      D1      P1
  2       D2      P2      D2
  3       P3      D3      D3
  4       D4      D4      P4
  5       D5      P5      S5      D5
  6       P6      S6      D6      D6
  7       S7      D7      D7      P7
 etc.



Similarly, the data and parity blocks of array 2 are placed on DASDs 4, 5 and 6 except when they need to be shifted
to the right to satisfy the placement of a spare or array 1.

The result has each DASD with 4 data blocks, 2 parity blocks and a spare block, so there is uniform distribution
of spares and parity across all DASDs in the array.
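
The distribution step described above can be sketched as follows. This is a minimal illustration under stated
assumptions, not the patent's implementation: the function names are hypothetical, and "shift to the right" is
modelled by letting array 1 and then array 2 take the leftmost slots not already occupied, with the spare placed
first. The parity rotation within each 2+P group is the one shown in Table 3.

```python
from collections import Counter

N_DASDS = 7  # two 2+P arrays plus one DASD's worth of distributed spare blocks


def parity_group(row, data, parity):
    """Three block labels of one 2+P parity group; the parity block moves
    one position to the left on each successive row, as in Table 3."""
    blocks = [f"{data}{row}"] * 3
    blocks[2 - (row - 1) % 3] = f"{parity}{row}"
    return blocks


def layout_row(row):
    """Lay out one block address (row 1..7) across the 7 DASDs."""
    slots = [None] * N_DASDS
    slots[N_DASDS - row] = f"S{row}"  # S1 on DASD 7, S2 on DASD 6, ...
    # Array 1 blocks first, then array 2 blocks; each takes the leftmost
    # free slot, i.e. it is shifted right past anything already placed.
    pending = parity_group(row, "D", "P") + parity_group(row, "d", "p")
    for i in range(N_DASDS):
        if slots[i] is None:
            slots[i] = pending.pop(0)
    return slots


def build_layout():
    return [layout_row(r) for r in range(1, N_DASDS + 1)]


def per_dasd_counts(layout):
    """(data, parity, spare) block counts for each DASD (column)."""
    counts = []
    for column in zip(*layout):
        kinds = Counter(label[0].upper() for label in column)
        counts.append((kinds["D"], kinds["P"], kinds["S"]))
    return counts


if __name__ == "__main__":
    for r in build_layout():
        print(" ".join(f"{b:>3}" for b in r))
```

Running the sketch prints the seven rows of Table 2, and `per_dasd_counts` confirms that every DASD ends up with
4 data blocks, 2 parity blocks and 1 spare block.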

Groups of Arrays 

In the above embodiment, the storage subsystem would expand by adding two 2+P arrays at a time. Relatedly,
each two-array unit would have one spare DASD's worth of spare blocks to share between its arrays.

In this embodiment, assume that the storage subsystem has expanded to 14 DASDs (2 array groups of 7 DASDs).
Each array group of 7 DASDs has two 2+P arrays and 1 DASD equivalent of distributed spare blocks. Let the arrays
in group 1 be array 1 and array 2; let the arrays in group 2 be array 3 and array 4. Consider that a DASD in array 1
fails; then it would be rebuilt to spare space in group 1.
Assume next that a DASD in array 2 fails. Even though array 2 is part of group 1, it would be allowed to be rebuilt
to the spare space in group 2, since no spare space remains in group 1. This dispersal of data from a DASD in a first
group to DASDs in other groups results in a decrease in MTTDL of the storage subsystem until the failed DASDs have
been replaced and data has been copied back.
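
The choice of rebuild target in this two-group example can be sketched as a simple policy: use the failed DASD's own
group while it still has spare space, otherwise fall back to another group. The function name, the dictionary of
free spare capacity, and the lowest-numbered-group fallback order are assumptions made for illustration; the text
does not prescribe a particular selection rule.

```python
def pick_rebuild_group(failed_group, free_spare):
    """Return the group whose spare space should receive the rebuilt blocks.

    free_spare maps a group id to the number of unused DASD equivalents of
    spare space in that group. Returns None if no spare space is left.
    """
    if free_spare.get(failed_group, 0) > 0:
        return failed_group
    for group in sorted(free_spare):  # assumed fallback order
        if free_spare[group] > 0:
            return group
    return None


# Two groups of 7 DASDs, each with 1 DASD equivalent of spare blocks.
free_spare = {1: 1, 2: 1}
targets = []
for failed_group in (1, 1):  # a DASD in array 1 fails, then one in array 2
    target = pick_rebuild_group(failed_group, free_spare)
    if target is not None:
        free_spare[target] -= 1
    targets.append(target)

print(targets)
```

The first failure is rebuilt within group 1; the second, although also in group 1, is rebuilt to group 2, which is
the dispersal that temporarily lowers the MTTDL.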

Storage System Expansion by Array Multiples

Expansion of a storage subsystem by array multiples bears resemblance to the previous distribution of groups
and spare blocks. For example, suppose two 2+P arrays share a single spare. The subsystem would be initially
configured to support a single 2+P+S array. The system would expand by another 2+P array, resulting in a system with
two 2+P arrays and a spare distributed amongst them. The layout of data and spares when there is only a single 2+P
system is shown in Table 5.



Table 5. 



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S7 


D7 
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P7 





When the next 2+P array is added to the same group, Table 6 depicts a distribution which shares the spare between
the two 2+P arrays in the group without any data movement:



Table 6. 
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Table 6. (continued) 
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S4 
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p4 


D5 
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D5 


d5 
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d5 


P6 


S6 


D6 


D6 


p6 


d6 


d6 
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D7 


D7 


P7 


d7 


d7 


P7 



Advantageously, this distribution allows the K block capacity of a single spare DASD to be shared between multiple 
arrays, but does not require that expansion be in terms of multiple arrays. This avoids oversparing with respect to an 
ultimate sparing fraction objective. 



Extensions 

The invention has been described where both the parity blocks and spare DASD capacity have been distributed. 
One extension is to preserve the parity blocks on a dedicated DASD and distribute just the spare capacity. 

Combinations of the aforementioned parity group and sparing distributions could be employed. It is possible to 
mix the distribution pattern in a multi-group array. That is, the storage subsystem is capable of expansion by adding 
an array with spare, an array without a spare or multiple arrays with a shared spare between them at different times. 
The particular patterns used would be a function of the size of the system, the spare replacement policy and the sparing 
ratio that is considered acceptable. 



Claims 

1. A method for rebuilding portions of parity groups resident on a failed DASD in a storage subsystem having a
plurality of DASDs, each parity group including N data, P parity and S spare blocks, each DASD storing K blocks,
the method comprising the steps of:

configuring an array of N+P+S DASDs;

distributing K parity groups (where K/(N+P+S) is an integer) in synchronous array addresses across subsets
of N+P DASDs of the array such that no two blocks from the same parity group reside on the same DASD,
each DASD storing data or parity blocks from (K-K*S/(N+P+S)) parity groups; the method being characterised
by the steps of:

distributing K*S blocks as spare storage across the array such that each DASD includes K*S/(N+P+S) spare
blocks thereon; and

in the event of a single DASD failure, for each of the K-K*S/(N+P+S) parity groups on the failed DASD,
regenerating the lost data or parity block of the parity group of said failed DASD from the remaining data and
parity blocks of said parity group, and writing the regenerated block into the spare block of said parity group
such that no two blocks of the same parity group are distributed on the same DASD.

2. A method as claimed in claim 1, where P=S=1.

3. A method as claimed in claim 2, wherein each parity group is written into N+1 storage locations, and upon failure
of a single DASD and rebuilding of said parity groups, only K-(K/(N+2)) storage locations are rendered unavailable.

4. A method as claimed in claim 2 or claim 3, wherein all array DASDs other than the failed DASD are addressable
and responsive to access commands whether operated in fault tolerant or degraded modes.

5. A method as claimed in claim 2, wherein the steps of distributing up to K parity groups and K spare storage blocks
further comprise the step of:

distributing said parity groups and spare blocks across N+2 DASDs such that no more than one spare storage
block nor more than one parity block are stored on the same synchronous array address or on the same DASD.

6. A method as claimed in claim 1, wherein S=2 and the step of distributing comprises:

distributing 2*K blocks of spare storage and K*P parity blocks such that no more than two spare storage
blocks nor more than P parity blocks are stored on the same synchronous array address or on the same DASD.

7. A method as claimed in any of claims 2 to 6, wherein the parity block from each of the K groups is written to a 
dedicated one of the N+1+S DASDs. 

8. A method as claimed in claim 2, wherein each parity group is accessed concurrently from a selective subset of 
N+1 of the N+2 DASDs. 

9. A method as claimed in claim 2, wherein each parity group is accessed non-concurrently from a selective subset 
of N+1 of the N+2 DASDs. 

10. A method as claimed in claim 2, wherein each DASD includes cyclic track storage means of m tracks; and means
for moving from track to track and reading or writing data or parity blocks selectively along one or more tracks;
and further wherein the step of regenerating the lost data includes the steps of:

(1) positioning the moving means to a predetermined location on the cyclic track storage means of each of
the remaining N+1 DASDs and traversing all m tracks starting from the predetermined location;

(2) at the start of the traverse, logically combining and writing the block lost from the first parity group onto the
spare block of the (N+2)nd DASD concurrent with a reading operation performed by the remaining N other
DASDs;

(3) continuing logically combining and writing the block lost from the second parity group on the spare block
of the (N+1)st DASD concurrent with a reading operation performed by the remaining N DASDs; and

(4) repeating step (3) until each block stored on the failed DASD from the K-K/(N+2) parity groups is recreated
and rewritten into a counterpart spare block across each of the remaining DASDs.

11. A storage subsystem for rebuilding portions of parity groups resident on a failed DASD, the parity groups each
comprising N data blocks, P parity blocks and S spare blocks, the subsystem comprising:

an array formed from N+P+S DASDs, each DASD storing K blocks;

first means for distributing K parity groups (where K/(N+P+S) is an integer) in synchronous addresses across
subsets of N+P DASDs selected from the array such that no two blocks from the same parity group are stored
on the same DASD;

means for distributing K*S blocks of storage as spare blocks such that each array DASD reserves K*S/(N+P+S)
spare blocks thereon;

identifying means for identifying any single DASD failure; and

means responsive to any single DASD failure identified by the identifying means for processing each of the
K-K*S/(N+P+S) parity groups of the failed DASD by regenerating the lost data or parity block of the parity
group of said failed DASD from the remaining data and parity blocks of said parity group, and writing the
regenerated block into the spare block of said parity group such that no two blocks of the same parity group
are distributed on the same DASD.

12. A storage subsystem as claimed in claim 11, wherein the K*S spare blocks are distributed such that no two blocks
occupy the same array address and the same DASD.

13. A storage subsystem as claimed in claim 11 or claim 12, wherein P=S=1 and upon failure of a single DASD and
rebuilding of said parity groups, only K-(K/(N+2)) block storage locations are rendered unavailable for array use.

14. A storage subsystem as claimed in claim 13, wherein the parity from each of the K-K/(N+2) groups is written to a
dedicated one of the N+2 DASDs.

15. A storage subsystem as claimed in claim 13 or claim 14, wherein each DASD includes cyclic track storage means
of m tracks; and means for moving from track to track and reading or writing data or parity blocks selectively along
one or more tracks; and further wherein the means for logically combining and writing the K parity groups includes:

means for positioning the moving means to a predetermined location on the cyclic track storage means of
each of the remaining N+1 DASDs and for traversing all m tracks starting from the predetermined location;

third means at the start of the traverse, for logically combining and for writing the block lost from the first parity
group on the spare block of the (N+2)nd DASD concurrent with a reading operation performed by the remaining
N other DASDs; and

fourth means including the third means for continuing logically combining and writing the block lost from the
second parity group on the spare block of the (N+1)st DASD concurrent with a reading operation performed
by the remaining N DASDs, and for repeating the combining and writing until each block stored on the failed
DASD from the K parity groups is recreated and rewritten into a counterpart spare block across the remaining
DASDs.

16. A storage subsystem as claimed in any of claims 11 to 15 wherein S=2, said distributing means distributing the
capacity equivalent of up to 2*K blocks of storage as spare blocks across the array of N+P+2 DASDs such that
no more than two spare storage locations nor more than one parity block are stored on the same synchronous
address or on the same DASD.

17. A storage subsystem for rebuilding portions of parity groups resident on a failed DASD, each parity group
comprising N data blocks, P parity blocks, the subsystem comprising:

a first and a second failure independent array each formed from at least N+P+1 DASDs, each DASD having
the capacity to store K blocks;

first means for distributing K parity groups (where K/(N+P+1) is an integer) across N+P+1 DASDs of either
the first or second arrays mutually exclusively such that no two blocks from the same parity group are stored
on the same DASD;

means for distributing K blocks of storage as spare blocks across N+P+1 DASDs of the first array and K blocks
of storage as spare blocks across N+P+1 DASDs of the second array such that in each array only one storage
block resides at each synchronous address and on each DASD;

second means for identifying a first or a second DASD failure occurring in either the first or second arrays;

means responsive to any single DASD failure identified by the second means, for processing each of the
K-K/(N+P+1) parity groups of the failed DASD by regenerating the lost data or parity block of each parity group
of the failed DASD from the remaining data and parity blocks of said parity group, and writing the regenerated
block into the spare block of said parity group such that no two blocks of the same parity group are distributed
on the same DASD; and

means responsive to any second DASD failure identified in the same or other array by the second means, for
processing, in the array having available spare blocks, each of the K-K/(N+P+1) parity groups of the second
failed DASD by regenerating the lost data or parity block of each parity group of the failed DASD from the
remaining data and parity blocks of said parity group, and writing the regenerated block into one of the
remaining K*(N+P)/(N+P+1) spare blocks such that no two blocks of the same parity group are distributed on
the same DASD.

18. A storage subsystem as claimed in claim 17, wherein the means responsive to said second DASD failure maintains
the subsystem in a lossless information state only if the second failure occurs after the means responsive to the
first failure has rebuilt the parity groups of the failed DASD and written them into counterpart spare blocks.

19. A method for rebuilding portions of parity groups resident on a failed DASD in a storage subsystem comprising a
first and a second failure independent array of DASDs, each DASD having the capacity to store K blocks, each
parity group including N data and P parity blocks, comprising the steps of:

(a) configuring a first array of N+P+1 DASDs and a second array of N+P DASDs;

(b) distributing up to K parity groups (where K/(N+P+1) is an integer) in synchronous array addresses across
subsets of N+P DASDs out of N+P+1 DASDs of the first array and K parity groups across N+P DASDs of the
second array such that no two blocks from the same parity group reside on the same DASD, and distributing
K blocks as spare storage across the first array such that each DASD includes K/(N+P+1) spare blocks thereon,
each synchronous address and DASD having only one spare block thereon;

(c) in the event of a single DASD failure occurring in the first array for each of the K-K/(N+P+1) parity groups,
regenerating the lost data or parity block of the parity group of said failed DASD from the remaining data and
parity blocks of said parity group, and writing the regenerated block into one of the remaining K*(N+P)/(N+P+1)
spare blocks such that no two blocks of the same parity group are distributed on the same DASD; and

(d) in the event of a single DASD failure occurring in the second array for each of the K parity groups,
regenerating the lost data or parity block of the parity group of said failed DASD from the remaining data and
parity blocks of said parity group, and writing the regenerated block into one of the K spare blocks located on
the first array.



20 Patentanspriiche 

1 . Ein Verfahren zum Wiederherstellen von Teilen von Paritatsgruppen, die auf einem defekten Direktzugriffsspeicher 
(DASD) in einem Speicheruntersystem mit einer Vielzahl von DASDs resident sind, wobei jede Paritatsgruppe N 
Datenblocke, P Paritatsblocke und S Reserveblocke enthalt und jeder DASD K Blocke abspeichert, wobei das 

25 Verfahren die folgenden Schritte umfafBt: 

Konfiguration eines Feldes aus N+P+S DASDs; 

Verteilen von K Paritatsgruppen (wobei K/(N+P+S) eine Ganzzahl ist) in synchronen Feldadressen quer uber 
30 Teilmengen von N+P DASDs des Feldes, so da3 nicht zwei Blocke aus der gleichen Paritatsgruppe auf dem 

gleichen DASD resident sind; wobei jeder DASD Daten- oder Paritatsblocke aus (K-K*S/(N+P+S)) Paritats- 
gruppen abspeichert, wobei das Verfahren durch folgende Schritte gekennzeichnet ist: 

Verteilen von K*S Blocken als Reservespeicher uber das Feld, so daR jeder DASD K*S/(N+P+S) Reserve- 
35 blocke darauf beinhaltet; und 

im Falle einer einzigen DASD-Storung fur jede der K-K*S/(N+P+S) Paritatsgruppen auf dem gestorten DASD 
Regenerieren der verlorenen Daten- oder Paritatsblocke der Paritatsgruppe des gestorten DASD aus den 

restlichen Daten- und Paritatsblocken der betreffenden Paritatsgruppe und Schreiben des regenerierten 
40 Blocks in den Reserveblock der Paritatsgruppe, so da3 keine zwei Blocke der gleichen Paritatsgruppe auf 

dem gleichen DASD verteilt sind. 

2. Ein Verfahren gemaR Anspruch 1, worin P=S=1 ist. 

45 3. Ein Verfahren gema3 Anspruch 2, worin jede Paritatsgruppe in N+1 Speicherstellen geschrieben wird, und bei 
Storung eines einzigen DASD und Neuaufbau der Paritatsgruppen nur K-(K/(N+2)) Speicherstellen nicht verfugbar 
gemacht werden. 

4. Ein Verfahren gemaB Anspruch 2 oder Anspruch 3, worin alle Felder-DASDs, abgesehen von dem defekten DASD, 
50 adressierbar sind und auf Zugriffsbefehle ansprechen, unabhangig davon, ob sie im fehlertoleranten oder im her- 

abgesetzten Modus betrieben werden. 

5. Ein Verfahren gemaB Anspruch 2, worin die Schritte des Aufteilens von bis zu K Paritatsgruppen und K Reserve- 
speicherblocke ferner die folgenden Schritte umfaBt: 

55 Verteilen der Paritatsgruppen und Reserveblocke uber N+2 DASDs, so da3 weder mehr als ein Reservespeicher- 

block noch mehr als ein Paritatsblock unter der gleichen synchronen Feldadresse oder auf dem gleichen DASD 
gespeichert sind. 
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Ein Verfahren gemaG Anspruch 1, worin S=2 ist und der Schritt des Verteilens umfa3t: 

Verteilen von 2*K Blocken der Reservespeicherung und K*P Paritatsblocken, so da3 weder mehr als zwei Reser- 
vespeicherblocke noch mehr als P Paritatsblocke unter der gleichen synchronen Feldadresse oder auf dam glei- 
chen DASD gespeichert sind. 

Ein Verfahren gema3 einem beliebigen der Anspruche 2 bis 6, worin der Paritatsblock jeder der K Gruppen auf 
einen test zugeordneten der N+1+S DASDs geschrieben wird. 

Ein Verfahren gemaB Anspruch 2, worin auf jede Paritatsgruppe parallel von einer selektiven Teilmenge von N+1 
der N+2 DASDs aus zugegriffen wird. 

Ein Verfahren gema3 Anspruch 2, worin auf jede Paritatsgruppe im Nichtparallelbetrieb von einer selektiven Teil- 
menge von N+1 der N+2 DASDs aus zugegriffen wird. 

Ein Verfahren gema3 Anspruch 2, worin jeder DASD zyklische Spurspeichermittel mit M Spuren, und Mittel zum 
Bewegen von Spur zu Spur und zum Lesen oder Schreiben von Daten- oder Paritatsblocken selektiv auf einer 
Oder mehreren Spuren beinhaltet; und femer der Schritt des Regenerierens der verlorenen Daten die folgenden 
Schritte umfa3t: 

(1) Positionieren der Bewegungsmittel auf eine vorgegebene Stelle im zyklischen Spurspeichermittel jedes 
der restlichen N+1 DASDs und Uberqueren aller m Spuren, ausgehend von der vorgegebenen Stelle; 

(2) zu Beginn der Uberquerung logisches Kombinieren und Schreiben des aus der ersten Paritatsgruppe ver- 
loren gegangenen Blocks in den Reserveblock des (N+2)-ten DASD parallel zu einer Leseoperation, die von 
den restlichen N oder anderen DASDs durchgefuhrt wird; 

(3) fortgesetztes logisches Kombinieren und Schreiben des aus der zweiten Paritatsgruppe verloren gegan- 
genen Blocks in den Reserveblock des (N+1)-ten DASD parallel zu einer Leseoperation, die von den restlichen 
N DASDs durchgefuhrt wird; und 

(4) Wiederholen des Schritts (3) bis jeder im defekten DASD abgespeicherte Block aus den K-K/(N+2) Pari- 
tatsgruppen wiederhergestellt und wieder in einen Gegenstuck-Reserveblock quer uber jeden der restlichen 
DASDs geschrieben ist. 

Ein Speicheruntersystem zum Wiederaufbau von Teilen von Paritatsgruppen, die auf einem defekten DASD resi- 
dent sind, wobei die Paritatsgruppen jeweils aus N Datenblocken, P Paritatsblocken und S Reserveblocken be- 
stehen, wobei das Untersystem umfaBt: 

ein Feld, das aus N+P+S DASDs gebildet wird, wobei jeder DASD K Blocke abspeichert; 

erste Mittel zum Verteilen von K Paritatsgruppen (wobei K/(N+P+S) eine Ganzzahl ist) auf synchrone Adressen 
quer Ober Teilmengen von N+P DASDs, die so vom Feld ausgesucht sind, da3 nicht zwei Blocke aus der 
gleichen Paritatsgruppe auf dem gleichen DASD gespeichert werden; 

Mittel zum Verteilen von K*S Speicherblocken als Reservespeicherblocke, so da3 jeder Feld-DASD K*S/ 
(N+P+S) Reserveblocke davon reserviert; 

Identifizierungsmittel zum Identifizieren jeder einzigen DASD-Storung; und 

Mittel, die auf jede einzelne DASD-Storung ansprechen, die von den Identifizierungsmittein identifiziert wird, 
zum Verarbeiten jeder der K-K*S/(N+P+S) Paritatsgruppen auf dem gestorten DASD durch Regenerieren der 
verlorenen Daten- oder Paritatsblocke der Paritatsgruppe des defekten DASD aus den restlichen Daten- und 
Paritatsblocken der betreffenden Paritatsgruppe, und Schreiben des regenerierten Blocks in den Reserveblock 
der Paritatsgruppe, so da3 keine zwei Blocke der gleichen Paritatsgruppe uber den gleichen DASD verteilt 
sind. 

Ein Speicherungsuntersystem gemaB Anspruch 11 , in dem die K*S Reserveblocke so verteilt sind, da3 keine zwei 
Blocke die gleiche Feldadresse und den gleichen DASD besetzen. 
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13. Ein Speicherungsuntersystem gemaB Anspruch 11 Oder Anspruch 12, in dem P=S=1 ist und bei Storung eines 
einzigen DASD und Wiederaufbau der Paritatsgruppen nur K-(K/(N+2)) Blockspeicherstellen nichtfurdie Feldan- 
wendung verfiigbar gemacht werden. 

5 14. Ein Speicherungsuntersystenn gemaB Anspruch 13, worin die Paritat von jeder der K-K(N+2) Gruppen auf einen 
fest zugeordneten der N+2 DASDs gescl^rieben wird. 

15. Ein Speicherungsuntersystenn gemafB Anspruch 13 oder Anspruch 14, in dem jeder DASD zyklische Spurspei- 
chermittel mit m Spuren, und IVIittel zum Bewegen von Spur zu Spur und zum Lesen oder Schreiben von Daten- 

10 Oder Paritatsblocken selektiv auf eine oder mehrere Spuren beinhaltet; und worin ferner das Mittel zum logischen 

Kombinieren und Schreiben der K Paritatsgruppen unnfaBt: 

Mittel zum Positionieren der Bewegungsmittel auf eine vorgegebene Stelle im zyklischen Spurspeichermittel 
jedes 

15 

der restlichen N+1 DASDs und zum Uberqueren aller m Spuren, ausgehend von der vorgegebenen Stelle; 

dritte Mittel am Anfang des Uberquerens, zum logischen Kombinieren und zum Schreiben des aus der ersten 
Paritatsgruppe verloren gegangenen Blocks in den Reserveblock des (N+2)-ten DASD parallel mit einer Le- 
20 seoperation, die von den restlichen N anderen DASDs durchgefuhrt wird; und 

vierte Mittel, die die dritten Mittel einschlieBen, zum Fortsetzen des logischen Kombinierens und Schreibens 
des aus der zweiten Paritatsgruppe verloren gegangenen Blocks in den Reserveblock des (N+1)-ten DASD 
parallel zu einer Leseoperation, die von den restlichen N DASDs durchgefuhrt wird, und zum Wiederholen 
25 des Kombinierens und des Schreibens, bis jeder auf dem defekten DASD gespeicherte Block von den K 

Paritatsgruppen neu geschaffen und in einem Gegenstuck-Reserveblock quer Ober die restlichen DASDs neu 
geschrieben ist. 

16. Ein Speicherungsuntersystem gemaB einem beliebigen der Anspruche 11 bis 15, indemS=2 ist, die Verteilermittel 
30 die Kapazitat von entsprechend bis zu 2*K Speicherblocken als Reserveblocke uber das Feld der N+P+2 DASDs 

verteilen, so daB weder mehr als zwei Resen/espeicherstellen noch mehr als ein Paritatsblock unter der gleichen 
synchronen Feldadresse oder auf dem gleichen DASD gespeichert sind. 

17. Ein Speicherungsuntersystem zum Neuaufbau von Paritatsgruppen, die auf einem defekten DASD resident sind, 
35 wobei jede Paritatsgruppe N Datenblocke, P Paritatsblocke umfaBt und das Untersystem beinhaltet: 

ein erstes und ein zweites fehlerunabhangiges Feld, jeweils gebildet aus mindestens N+P+1 DASDs, wobei 
jeder DASD die Kapazitat zum Abspeichern von K Blocken aufweist; 

40 erste Mittel zum Verteilen von K Paritatsgruppen (wobei K/(N+P+S) eine Ganzzahl ist) uber N+P+1 DASDs 

entweder des ersten oder des zweiten Feldes, die sich gegenseitig ausschlieBen, so daB keine zwei Blocke 
aus der gleichen Paritatsgruppe auf dem gleichen DASD abgespeichert sind; 

Mittel zum Verteilen von K Speicherblocken als Reserveblocke quer uber N+P+1 DASDs des ersten Feldes, 
45 und von K Speicherblocken als Reserveblocke quer uber N+P+1 DASDs des zweiten Feldes, so daB in jedem 

Feld nur ein Speicherblock an jeder synchronen Adresse und auf jedem DASD resident ist; 

zweite Mittel zum Identifizieren einer ersten oder einer zweiten DASD-Storung, die entweder im ersten oder 
im zweiten Feld auftritt; 

50 

Mittel, die auf jede einzelne durch die zweiten Mittel identifizierte DASD-Storung ansprechen, um jede der K- 
K/(N+P+1) Paritatsgruppen des gestorten DASD zu verarbeiten durch Regenerieren der verlorenen Daten- 
oder Paritatsblocke jeder Paritatsgruppe des gestorten DASD aus den restlichen Daten- und Paritatsblocken 
der betreffenden Paritatsgruppe, und Schreiben des regenerierten Blocks in den Reserveblock der Paritats- 
55 gruppe, so daB keine zwei Blocke der gleichen Paritatsgruppe uber den gleichen DASD verteilt sind; und 

Mittel, die auf jede zweite durch die zweiten Mittel in dem gleichen oder einem anderen Feld durch die zweiten 
Mittel identifizierte DASD-Storung ansprechen, um in dem Feld, das Reserveblocke zur Verfugung hat, jede 
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der K-K/(N+P+1) Paritatsgruppen des zweiten gestorten DASD zu verarbeiten durch Regenerieren der ver- 
lorenen Daten- oder Paritatsblocke in jeder Paritatsgruppe des defekten DASD aus den restlichen Daten- und 
Paritatsblocken der betreffenden Paritatsgruppe, und Schreiben des regenerierten Blocks in einen der restli- 
chen K*(N+P)/(N+P+1 ) Reserveblocke, so da3 keine zwei Blocke der gleichen Paritatsgruppe uber den glei- 
5 Chen DASD verteilt sind. 

18. Ein Speicherungsuntersystem gema3 Anspruch 17, in demdasauf diezweite DASD-Storung ansprechende Mittel 
das Untersystem nur dann in einem verlustfreien Intormationszustand halt, wenn die zweite Storung stattfindet 
nachdem das auf die erste Storung ansprechende Mittel die Paritatsgruppen des defekten DASD wiederhergestellt 

10 hat und sie in Gegenstuck-Reserveblocke geschrieben hat. 

19. Ein Verfahren zum Wiederherstellen von Teilen von Paritatsgruppen, die aut einem defekten DASD in einem Spei- 

cheruntersystem mit einem ersten und zweiten storungsunabhangigen DASD-Feld resident sind, wobei jede Pa- 
ritatsgruppe die Kapazitat zum Speichem von K Blocken aufweist, jede Paritatsgruppe N Daten- und P Paritats- 
15 blocke beinhaltet, und das aus folgenden Schritten besteht: 

(a) Konfiguration eines ersten Feldes aus N+P+1 DASDs und eines zweiten Feldes aus N+P DASDs; 

(b) Verteilen von bis zu K Paritatsgruppen (wobei K/(N+P+S) eine Ganzzahl ist) in synchrone Feldadressen 
20 quer uber Teilmengen von N+P DASDs aus N+P+1 DASDs aus dem ersten Feld und K Paritatsgruppen uber 

N+P DASDs des zweiten Feldes, so da3 keine zwei Blocke aus der gleichen Paritatsgruppe auf dem gleichen 
DASD resident sind, und Verteilen von K Blocken als Reservespeicher quer uber das erste Feld, so daB jeder 
DASD K/(N+P+1 ) darin Reserveblocke aufweist, wobei jede Synch ronadresse und jeder DASD darin nur einen 
einzigen Reserveblock aufweist; 

25 

(c) bei Vorkommen einer einzigen DASD-Storung, die im ersten Feld auftritt, fur jede der K-K/(N+P+1) Pari- 
tatsgruppen Regenerieren des verlorenen Daten- oder Paritatsblocks der Paritatsgruppe des defekten DASD 
aus den restlichen Daten- und Paritatsblocken der betreffenden Paritatsgruppe und Schreiben des regene- 
rierten Blocks in ein Gegenstuck eines der restlichen K*(N+P)/(N+P+1) Reserveblocke, so da3 keine zwei 

30 Blocke der gleichen Paritatsgruppe uber den gleichen DASD verteilt sind; und 

(d) bei Vorkommen einer einzigen DASD-Storung, die im zweiten Feld auftritt, fur jede der K Paritatsgruppen 
Regenerieren des verlorenen Daten- oder Paritatsblocks der Paritatsgruppe des defekten DASD aus den 
restlichen Daten- und Paritatsblocken der betreffenden Paritatsgruppe und Schreiben des regenerierten 

35 Blocks in einen einzigen der K Reserveblocke, die im ersten Feld angeordnet sind. 



Revendications 

1. A method for rebuilding portions of parity groups residing on a failed DASD in a storage subsystem
having a plurality of DASDs, each parity group comprising N data blocks, P parity blocks and S spare blocks,
each DASD storing K blocks, the method comprising the steps of:

configuring an array of N+P+S DASDs;

distributing K parity groups (where K/(N+P+S) is an integer) at synchronous array addresses across
subsets of N+P DASDs of the array such that no two blocks of the same parity group reside on the same
DASD, each DASD storing data or parity blocks from K-K*S/(N+P+S) parity groups; the method being
characterized by the steps of:

distributing K*S blocks as spare storage across the array such that each DASD includes
K*S/(N+P+S) spare blocks thereon; and

in the event of a single DASD failure, for each of the K-K*S/(N+P+S) parity groups on the failed DASD,
regenerating the lost data block or parity block of said parity group of said failed DASD from the
remaining data and parity blocks of said parity group and writing the regenerated block into the spare
block of said parity group such that no two blocks of the same parity group are distributed on the
same DASD.

2. The method of claim 1, wherein P=S=1.

3. The method of claim 2, wherein each parity group is described in N+1 storage locations and, upon failure
of a single DASD and rebuilding of said parity groups, only K-(K/(N+2)) storage locations are made
unavailable.

4. The method of claim 2 or claim 3, wherein all DASDs of the array other than the failed DASD are
addressable and responsive to access commands whether in fault-tolerant mode or in degraded mode.

5. The method of claim 2, wherein the steps of distributing K parity groups and K spare storage blocks
further comprise the step of:

distributing said parity groups and said spare blocks across N+2 DASDs such that no more than one spare
storage block and no more than one parity block is stored at the same synchronous array address or on
the same DASD.

6. The method of claim 1, wherein S = 2 and the distributing step comprises the step of:
distributing 2*K spare storage blocks and K*P parity blocks such that no more than two spare storage
blocks and no more than P parity blocks are stored at the same synchronous array address or on the
same DASD.

7. The method of any one of claims 2 to 6, wherein the parity block of each of the K groups is written
to a dedicated DASD among the N+1+S DASDs.

8. The method of claim 2, wherein each parity group is accessed concurrently on a selected subset of
N+1 DASDs among the N+2 DASDs.

9. The method of claim 2, wherein each parity group is accessed non-concurrently on a selected subset
of N+1 DASDs among the N+2 DASDs.

10. The method of claim 2, wherein each DASD comprises a cyclic multi-track storage means of M tracks,
and means for moving from track to track and for reading or writing data or parity blocks selectively
along one or more tracks; and wherein further the step of regenerating lost data comprises the steps of:

(1) positioning the moving means at a predetermined location on the cyclic track storage means of each
of the N+1 remaining DASDs and traversing all M tracks starting from the predetermined location;

(2) at the start of the traversal, logically combining and writing the lost block of the first parity
group to the spare block of the (N+2)th DASD concurrently with a read operation performed by the N
remaining or other DASDs;

(3) continuing to logically combine and write the lost block from the second parity group to the spare
block of the (N+1)th DASD concurrently with a read operation performed by the N remaining DASDs; and

(4) repeating step (3) until each block stored on the failed DASD from the K-K/(N+2) parity groups has
been recreated and rewritten to a counterpart spare block on each of the remaining DASDs.

11. A storage subsystem for rebuilding portions of parity groups residing on a failed DASD, the parity
groups each comprising N data blocks, P parity blocks and S spare blocks, the subsystem comprising:

an array formed of N+P+S DASDs, each DASD storing K blocks;

first means for distributing K parity groups (where K/(N+P+S) is an integer) at a synchronous address
across subsets of N+P DASDs selected from the array such that no two blocks of the same parity group
are stored on the same DASD;

means for distributing K*S storage blocks as spare blocks such that each DASD of the array reserves
K*S/(N+P+S) spare blocks thereon;

identifying means for identifying each failed DASD; and

means responsive to the failure of the failed DASD identified by the identifying means for processing
each of the K-K*S/(N+P+S) parity groups of the failed DASD by regenerating the lost data block or lost
parity block of the parity group of said failed DASD from the remaining data and parity blocks of said
parity group and writing the regenerated block into the spare block of said parity group such that no
two blocks of the same parity group are distributed on the same DASD.

12. The storage subsystem of claim 11, wherein the K*S spare blocks are distributed such that no two
blocks occupy the same array address and the same DASD.

13. The storage subsystem of claim 11 or claim 12, wherein P=S=1 and, upon failure of a single DASD and
rebuilding of said parity groups, only K-(K/(N+2)) block storage locations are made unavailable for use
by the array.

14. The storage subsystem of claim 13, wherein the parity of each of the K-K/(N+2) groups is written to
a dedicated DASD among the N+2 DASDs.

15. The storage subsystem of claim 13 or claim 14, wherein each DASD comprises a cyclic multi-track
storage means of m tracks, and means for moving from track to track and for reading or writing data or
parity blocks selectively along one or more tracks; and wherein further the means for logically
combining and writing the K parity groups comprises:

means for positioning the moving means at a predetermined location on the cyclic track storage means of
each of the N+1 remaining DASDs and for traversing all m tracks starting from the predetermined
location;

third means, at the start of the traversal, for logically combining and writing the lost block of the
first parity group to the spare block of the (N+2)th DASD concurrently with a read operation performed
by the N other remaining DASDs; and

fourth means, including the third means, for continuing to logically combine and write the lost block
of the second parity group to the spare block of the (N+1)th DASD concurrently with a read operation
performed by the N remaining DASDs, and for repeating the combining and writing until each block stored
on the failed DASD from the K parity groups has been recreated and rewritten to a counterpart spare
block on the remaining DASDs.

16. The storage subsystem of any one of claims 11 to 15, wherein S=2, said distributing means
distributing the capacity equivalent of up to 2*K storage blocks as spare blocks across the array of
N+P+2 DASDs such that no more than two spare storage locations and no more than one parity block are
stored at the same synchronous address or on the same DASD.

EP 0 518 603 B1

17. A storage subsystem for rebuilding portions of parity groups residing on a failed DASD, each parity
group comprising N data blocks and P parity blocks, the subsystem comprising:

a first and a second failure-independent array, each formed from at least N+P+1 DASDs, each DASD having
the capacity to store K blocks;

first means for distributing K parity groups (where K/(N+P+1) is an integer) across N+P+1 DASDs of one
of the first or second array, mutually exclusively, such that no two blocks of the same parity group
are stored on the same DASD;

means for distributing K storage blocks as spare blocks over the N+P+1 DASDs of the first array and K
storage blocks as spare blocks over the N+P+1 DASDs of the second array such that in each array only
one such storage block lies at each synchronous address and on each DASD;

second means for identifying a first failure or a second failure of a DASD occurring in the first or
the second array;

means responsive to each particular DASD failure identified by the second means for processing each of
the K-K/(N+P+1) parity groups of the failed DASD by regenerating the lost data or parity block of each
parity group of the failed DASD from the remaining data and parity blocks of said parity group and
writing the regenerated block into the spare block of said parity group such that no two blocks of the
same parity group are distributed on the same DASD; and

means responsive to each second DASD failure identified in the same array or the other array by the
second means for processing, in the array having available spare blocks, each of the K-K/(N+P+1) parity
groups of the second failed DASD by regenerating the lost data or parity block of each parity group of
the failed DASD from the remaining data and parity blocks of said parity group and writing the
regenerated block into one of the remaining K*(N+P)/(N+P+1) spare blocks such that no two blocks of the
same parity group will be distributed on the same DASD.

18. The storage subsystem of claim 17, wherein the means responsive to said second DASD failure
maintains the subsystem in a state without loss of information only if the second failure occurs after
the means responsive to the first failure has rebuilt the parity groups of the failed DASD and written
them to counterpart spare blocks.

19. A method for rebuilding portions of parity groups residing on a failed DASD in a storage subsystem
comprising a first and a second failure-independent DASD array, each DASD having the capacity to store
K blocks, each parity group comprising N data blocks and P parity blocks, comprising the steps of:

(a) configuring a first array consisting of N+P+1 DASDs and a second array consisting of N+P DASDs;

(b) distributing K parity groups (where K/(N+P+1) is an integer) at synchronous array addresses across
subsets of N+P DASDs among the N+P+1 DASDs of the first array and K parity groups over the N+P DASDs of
the second array such that no two blocks of the same parity group reside on the same DASD, and
distributing K blocks as spare storage across the first array such that each DASD includes K/(N+P+1)
spare blocks thereon, each synchronous address and DASD having only one spare block thereon;

(c) in the event of a single DASD failure occurring in the first array, for each of the K-K/(N+P+1)
parity groups, regenerating the lost data or parity block of the parity group of said failed DASD from
the remaining data and parity blocks of said parity group and writing the regenerated block into a
counterpart one of the remaining K*(N+P)/(N+P+1) spare blocks such that no two blocks of the same
parity group are distributed on the same DASD; and

(d) in the event of a DASD failure occurring in the second array, for each of the K parity groups,
regenerating the lost data or parity block of the parity group of said failed DASD from the remaining
data and parity blocks of said parity group and writing the regenerated block into one of the K spare
blocks located on the first array.
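
The single-array rebuild of claim 1 with P=S=1 (claim 2) can be illustrated in a few lines. This is a minimal sketch under stated assumptions, not the patented implementation: parity is taken to be a bytewise XOR, the rotation that places the spare on DASD (row mod N+2) and the parity on the next DASD is a hypothetical layout chosen for illustration, and all function names are invented here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def build_layout(n, k):
    """Roles per synchronous address: 'D' data, 'P' parity, 'S' spare.

    Hypothetical rotation over N+2 DASDs, so that every DASD ends up
    with K/(N+2) spare blocks and K/(N+2) parity blocks.
    """
    width = n + 2
    if k % width != 0:
        raise ValueError("K must be a multiple of N+2")
    layout = []
    for row in range(k):
        roles = ["D"] * width
        roles[row % width] = "S"
        roles[(row + 1) % width] = "P"
        layout.append(roles)
    return layout


def write_groups(layout, data_rows):
    """Store N data blocks and their XOR parity per row; spares stay empty."""
    array = []
    for roles, data in zip(layout, data_rows):
        blocks = iter(data)
        row = [next(blocks) if role == "D" else None for role in roles]
        row[roles.index("P")] = xor_blocks(data)
        array.append(row)
    return array


def rebuild(layout, array, failed):
    """Regenerate each lost block into the spare block of its own row."""
    for roles, row in zip(layout, array):
        lost_role = roles[failed]
        row[failed] = None                  # contents of the failed DASD are gone
        if lost_role == "S":
            continue                        # only a spare was lost in this row
        survivors = [blk for dasd, blk in enumerate(row)
                     if dasd != failed and roles[dasd] != "S"]
        spare = roles.index("S")
        row[spare] = xor_blocks(survivors)  # XOR of the N remaining blocks
        roles[spare], roles[failed] = lost_role, "X"  # 'X' marks the dead slot


n, k = 4, 6
data_rows = [[bytes([16 * r + c, r ^ c]) for c in range(n)] for r in range(k)]
layout = build_layout(n, k)
array = write_groups(layout, data_rows)
rebuild(layout, array, failed=2)
```

Because each parity group occupies one synchronous address and the regenerated block is written to that address's spare, no DASD ever holds two blocks of the same group after the rebuild.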
Figure 1: Traditional Sparing
On failure: read 6 regions from each of 4 DASDs; write 6 regions to one DASD. Total # of I/Os = 30.

Figure 2: Distributed Sparing
4+P array with one spare. On failure: read 4 regions from each of 5 DASDs; write 1 region to each of 5 DASDs. Total # of I/Os = 25.

Figure 3: Distributed Sparing With 2 Spares Per Array
3+P array with two spares.

Figure 4: Two Arrays With Distributed Sparing

Figure 5: Distributed Sparing Situation After One DASD Fails

Figure 6: Distributed Sparing Situation After Two DASDs Fail

Figure 7: Partially Distributed Sparing
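
The I/O totals annotated on Figures 1 and 2 can be reproduced with a short calculation. The sketch below is illustrative only; the function names are assumptions, and it assumes a 4+P array (N+P = 5) with 6 regions per DASD, one I/O per region read or written, as the figure annotations suggest.

```python
def traditional_rebuild_ios(n_plus_p: int, regions: int) -> int:
    """Rebuild onto a dedicated spare DASD (Figure 1)."""
    reads = (n_plus_p - 1) * regions  # every surviving DASD is read in full
    writes = regions                  # all rebuilt regions go to the one spare
    return reads + writes


def distributed_rebuild_ios(n_plus_p: int, regions: int) -> int:
    """Rebuild into spare regions spread over N+P+1 DASDs (Figure 2)."""
    width = n_plus_p + 1
    if regions % width != 0:
        raise ValueError("regions per DASD must be a multiple of N+P+1")
    rebuilt = regions - regions // width  # spare regions on the failed DASD need no rebuild
    reads = (n_plus_p - 1) * rebuilt      # N+P-1 surviving blocks per lost block
    writes = rebuilt                      # one write per regenerated block
    return reads + writes


print(traditional_rebuild_ios(5, 6))  # 30
print(distributed_rebuild_ios(5, 6))  # 25
```

In the distributed case the 20 reads and 5 writes are also spread over all 5 surviving DASDs, rather than concentrating every write on a single spare DASD.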